Abstract

Banyan networks comprise a large class of networks that have
been used for interconnection in large-scale multiprocessors and
telephone switching systems. Regular variants of Banyan networks,
such as delta and butterfly networks, have been used in
multiprocessors such as the IBM RP3 and the BBN Butterfly.
Analysis of the performance of Banyan networks has typically
focused on these regular variants. We present a methodology for
performance analysis of unbuffered Banyan multistage interconnection
networks. The methodology has two novel features: it
allows analysis of networks where some inputs are more likely to
be active than others, and it allows analysis of Banyan networks of
arbitrary topology.
Introduction

Banyan networks [2] comprise a large class of networks that have been used 
for interconnection in large-scale multiprocessors and telephone switching 
systems. A Banyan network is a network in which there is a unique path
from each input to each output (see footnote 1). Regular variants of Banyan networks, such
as delta and butterfly networks, have been used in multiprocessors such as 
the IBM RP3 [6] and the BBN Butterfly [7]. Analysis of the performance of 
Banyan networks has typically focused on these regular variants. 

Patel [5] presented a probabilistic analysis of the performance of delta
networks. His work assumed that all sources transmit with uniform probability,
and that all destinations are selected with uniform probability. Bhuyan
[1] has extended Patel's work to include analysis of the case where each
processor has a single favorite destination that is not the favorite destination
of any other processor. Kruskal and Snir [3] have extended Patel's work by
finding an asymptotic expression for the probability that a destination is
receiving for networks with large numbers of stages.

In what follows, we present a methodology for performance analysis of
general unbuffered Banyan networks. The analysis allows us to compute
exactly the probability of successful message transmission in a Banyan network
of arbitrary topology, under several assumptions:

1. The destination addresses for messages are uniformly distributed over
the outputs of the network.

2. The messages presented at each input are independent of the messages
presented at other inputs, and also of messages presented on any
previous cycle.

3. The network is fully synchronous, with all messages not dropped at 
stage n proceeding simultaneously to stage n + 1 at each clock cycle. 

Our methodology has two novel features: it does not assume that all 
sources transmit with equal probability and thus allows analysis of networks 
where some inputs are more likely to be active than others; and it allows 
analysis of Banyan networks of arbitrary topology. 

Our work proceeds from the observation that all of the differing topologies
for unqueued Banyan networks can be decomposed into combinations
of three basic operations: bundling, concentration, and switching. By
describing the behavior of these primitive elements with a probabilistic model,
we are able to evaluate the performance of any such network.

Footnote 1: Or from each base to each apex, in the terminology of Goke and Lipovsky.

We begin with a discussion of the use of probability mass functions to
describe network wiring, and then consider the effect of each of three basic
operations on these probability mass functions. Finally, we apply this model
in an analysis of two switching elements, the common $2^k \times 2^k$ crossbar and
the Transit RN1 switching element.

Modeling Message Traffic With Probability Mass Functions 

A multi-stage network consists of a set of message sources, a cascaded set 
of network switching elements, and a set of message destinations. Often the 
set of source and destination nodes is identical. 

The elements comprising a switching network are interconnected with 
channels. Each channel consists of a wire or group of wires that are switched 
as a single unit. A channel might, for example, consist of a single bidirectional
wire with serial encoding of messages, or a byte-wide data path with
an associated parity bit.

We associate with a channel a random variable $l$ whose value is the number
of messages, or the load, that the channel is carrying. The probability
mass function (PMF) of this random variable specifies for each non-negative
integer j the probability that the channel is carrying j messages. We call
this function the loading probability mass function (LPMF) for the channel.
If the random variable specifying the load for a channel is called $l$, then we
denote the LPMF for the channel $p_l(l_0)$.

For example, a single channel has a probability p of carrying a message 
and a probability of 1 - p of being idle. Thus the LPMF for a single channel 
is simply the PMF of a Bernoulli trial. 

Our event space is the space of loading configurations for a particular
network. That is, if we define a network as a set of message-carrying wires
connected to each other by the switching elements we shall define below, then
the elementary events in our event space are instances of this network with
some load specified for each channel in the network. The N + 1
possible values of $l$ for a channel that can carry a maximum of N messages
partition the event space into N + 1 mutually exclusive sets of elementary
events, each set containing all the network loading configurations for which
the load on that channel is some given value.

In later sections, we will find it useful to associate with a probability
mass function its unilateral Z-transform. We denote the Z-transform of a
PMF $p_x(x_0)$ by $p_x^T(z)$.
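As a small worked example, the transform of a single channel's Bernoulli LPMF can be written out. This assumes the convention in which the transform is a sum over positive powers of $z$, which is what the input-channel transform $Q_i z + (1 - Q_i)$ quoted in the crossbar analysis suggests; the symbol $p$ is the channel's probability of carrying a message.

```latex
% Assumed convention: p_x^T(z) = \sum_{j \ge 0} p_x(j) z^j (positive powers of z).
\begin{align*}
p_l(l_0)  &= (1 - p)\,\delta(l_0) + p\,\delta(l_0 - 1) \\
p_l^T(z)  &= \sum_{j=0}^{1} p_l(j)\, z^{j} = (1 - p) + p\,z \\
p_l^T(1)  &= 1, \qquad
\left.\frac{d}{dz}\, p_l^T(z)\right|_{z=1} = p = E[l]
\end{align*}
```

Under this convention the transform evaluated at $z = 1$ is the total probability, and its derivative there is the expected load.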

Bundling 

The first operation is the simplest. We call the grouping together of several 
channels to form a single wider channel bundling. The single wider channel 
that is a product of bundling we sometimes call a bundle. The loads on the 
constituent channels in a bundle must be independent, as they will be in a 
Banyan network with independent inputs. When we bundle two channels,
one of which can carry between 0 and n messages and the other of which can
carry between 0 and m messages, the resulting channel can carry between
0 and n + m messages. The loads of the channels being bundled are independent
random variables whose sum we are forming, so that the LPMF of
the resulting bundle will be the convolution of the LPMFs of the component
channels. If we denote the bundling of a and b as $B[p_a(a_0), p_b(b_0)]$, we have
simply

$$B[p_a(a_0), p_b(b_0)] = p_a(a_0) * p_b(b_0)$$

where * denotes convolution. In the Z-domain, then, bundling will only
require forming the product of the Z-transforms:

$$Z[B[p_a(a_0), p_b(b_0)]] = p_a^T(z) \cdot p_b^T(z)$$

Figure 1 depicts the result of bundling eight channels, each of which
carries a message with probability 1/2. The LPMF is clearly that of a binomial
distribution, because the sum of independent identically distributed
Bernoulli random variables is a binomial random variable.
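The eight-channel example can be checked numerically. The following is a minimal Python sketch, not the paper's own code: the list representation (index = load) and the helper name `bundle` are assumptions chosen to mirror the appendix function of the same name.

```python
from math import comb

def bundle(x, y):
    """LPMF of two independent bundles combined: the discrete convolution of x and y."""
    out = [0.0] * (len(x) + len(y) - 1)
    for i, px in enumerate(x):
        for j, py in enumerate(y):
            out[i + j] += px * py
    return out

# One channel carrying a message with probability 1/2: [P(load = 0), P(load = 1)].
channel = [0.5, 0.5]

# Start from an empty bundle (load 0 with certainty) and bundle in eight channels.
lpmf = [1.0]
for _ in range(8):
    lpmf = bundle(lpmf, channel)

# Reference: binomial distribution with 8 trials and success probability 1/2.
binomial = [comb(8, n) / 2**8 for n in range(9)]
```

Here `lpmf` and `binomial` agree entry by entry, which is the statement made about Figure 1.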

Concentration 

Our second elementary operation on channels is called concentration. In 
concentration, we take a bundle of M single channels and form from it a 
bundle of N single channels. If N < M, and the input bundle is carrying 
more than N messages, some messages will be lost. 

The effect on the LPMF of the input bundle is simple. If N ≥ M,
there is no effect on the LPMF. If N < M, the probability that more than
N messages can be carried on the output bundle is 0, but in cases where
messages are dropped, only enough will be dropped to bring the load to N.



Figure 1: Loading probability mass function, P{load = n}, for an eight-channel bundle,
where each channel carries a message with probability 1/2.



Thus the effect of the operation on an LPMF $p_l(l_0)$ will be both to clip it
to 0 for $l_0 > N$ and to add to $p_l(N)$ the sum of $p_l(l_0)$ for $l_0 > N$. Figure 2
shows the result of concentration on the LPMF of figure 1.
More explicitly, if the input LPMF is given by

$$p_l(l_0) = \sum_{i=0}^{M} p_l(i)\,\delta(l_0 - i)$$

where $\delta(n)$ is the unit impulse function, the result of N-concentration of
$p_l(l_0)$, a bundle composed of M channels, to N channels, is given by

$$C_{M,N}[p_l(l_0)] = p_l(l_0)\,u(N - l_0) + \left(\sum_{l_1=N+1}^{M} p_l(l_1)\right)\delta(l_0 - N)$$

where $u(n)$ is the unit step function.

If the Z-transform of $p_l(l_0)$ is $p_l^T(z)$, then we have

$$Z[C_{M,N}[p_l(l_0)]] = p_l^T(z) - \sum_{l_1=N+1}^{M} p_l(l_1)\,z^{l_1} + \left(\sum_{l_1=N+1}^{M} p_l(l_1)\right) z^{N}$$
Figure 2: 6-concentration of the LPMF of figure 1.



The first two terms in the transform are the result of taking the Z-transform
of the truncated LPMF, and the last term adds in the Z-transform
of the increased final element of the LPMF. Combining the last two terms,
we have

$$Z[C_{M,N}[p_l(l_0)]] = p_l^T(z) + \sum_{l_1=N+1}^{M} p_l(l_1)\left(z^{N} - z^{l_1}\right)$$
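Concentration and its transform identity can be checked on the 6-concentration of the eight-channel binomial LPMF. This is a hedged Python sketch: the helper names `concentrate` and `transform`, the list representation (index = load), and the test point `z = 0.7` are illustrative assumptions, and the transform uses positive powers of z.

```python
from math import comb

def concentrate(x, n):
    """Clip LPMF x to at most n messages, folding the excess probability into load n."""
    if n >= len(x) - 1:          # output at least as wide as the input: no effect
        return list(x)
    return list(x[:n]) + [sum(x[n:])]

def transform(x, z):
    """Evaluate the transform sum_j x[j] * z**j (positive powers assumed)."""
    return sum(p * z**load for load, p in enumerate(x))

M, N = 8, 6
lpmf = [comb(M, load) / 2**M for load in range(M + 1)]   # eight channels, p = 1/2
clipped = concentrate(lpmf, N)

# Compare the transform of the concentrated LPMF with the combined-terms formula.
z = 0.7
lhs = transform(clipped, z)
rhs = transform(lpmf, z) + sum(lpmf[l1] * (z**N - z**l1) for l1 in range(N + 1, M + 1))
```

The last entry of `clipped` collects the probability of loads 6, 7, and 8, and `lhs` matches `rhs` up to rounding.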



Switching 

The last operation we shall be using is switching. The switching operation
is performed on an input bundle of N channels and specifies the LPMFs for
two output bundles of N channels each. We designate the output bundles
bundle 0 and bundle 1. We specify for the modeled switch the probability
(1 - q) that each message in the input bundle is switched to channel
0; similarly, messages are switched to channel 1 with probability q.

We now consider the LPMFs for the two output bundles, given the LPMF
$p_l(l_0)$ of the input bundle. Suppose the input bundle is carrying i messages.
What is the probability that j messages, where j ≤ i, will be switched to
channel 1? It is the probability of j successes in i Bernoulli trials. If we call
the random variable specifying the load on the output bundle $l_s$, we have
for the conditioned probability

$$p_{l_s|l}(j \mid i) = \binom{i}{j} q^{j} (1 - q)^{i-j}$$

We may now apply the theorem of total probability to find an expression
for $p_{l_s}(j)$:

$$p_{l_s}(j) = \sum_{i=0}^{N} p_l(i)\,p_{l_s|l}(j \mid i) = \sum_{i=j}^{N} p_l(i)\,p_{l_s|l}(j \mid i)$$

where the summation's lower bound is changed in the second expression
because the probability that the output channel carries more messages than
the input channel is 0.

Thus if S specifies the probability of switching a message to a given
output bundle, and the input LPMF is given by $p_l(l_0)$, then the LPMF for
the output bundle is given by

$$\mathcal{S}[p_l(l_0), S] = p_{l_s}(l_{s0}) = \sum_{l_0=l_{s0}}^{N} p_l(l_0) \binom{l_0}{l_{s0}} S^{\,l_{s0}} (1 - S)^{\,l_0 - l_{s0}}$$



This can be interpreted as meaning that the probability that $l_{s0}$ messages
appear on an output bundle is the probability that $l_{s0}$ messages were on the
input bundle and all $l_{s0}$ messages were switched to the given output bundle,
plus the probability that $l_{s0} + 1$ messages were on the input
bundle and exactly $l_{s0}$ of these were switched to the given output bundle,
and so on, up to the maximum possible load of the input bundle.

An example of the effect of switching may be seen in figure 3. 
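The switching sum can be evaluated directly. The sketch below is an illustrative Python version under the assumption that an LPMF is a list indexed by load; the name `switch` mirrors the appendix function, and the input is the eight-channel binomial LPMF of figure 1.

```python
from math import comb

def switch(x, s):
    """LPMF of an output bundle when each message on x goes there with probability s."""
    top = len(x) - 1                       # maximum load of the input bundle
    return [
        sum(x[i] * comb(i, j) * s**j * (1 - s)**(i - j) for i in range(j, top + 1))
        for j in range(top + 1)
    ]

# Eight channels, each carrying a message with probability 1/2.
lpmf = [comb(8, n) / 2**8 for n in range(9)]

# Switch with probability 0.5, as in the figure 3 setting.
switched = switch(lpmf, 0.5)

# Thinning a binomial(8, 1/2) load with probability 1/2 gives binomial(8, 1/4).
expected = [comb(8, n) * 0.25**n * 0.75**(8 - n) for n in range(9)]
```

The two lists agree, so in this case each output bundle sees the load of eight channels that are each occupied with probability 1/4.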

To evaluate the Z-transform of $\mathcal{S}[p_l(l_0), S]$, we first note that the random
variable describing the number of messages on an output bundle may
be treated as the sum of a random number of identically distributed random
variables. We can see this by imagining individually switching each channel
in the input bundle to one output bundle or the other, before considering
whether it is carrying a message.

Then there is one random variable for each channel in the input bundle;
call it b. b is 1 if the channel is switched to the output bundle being considered,
and 0 if the channel is switched to the other output bundle. b's PMF
is then given by

$$p_b(b_0) = (1 - S)\,\delta(b_0) + S\,\delta(b_0 - 1)$$

for S the probability of switching to the given output bundle.

Figure 3: The effect of switching the LPMF of figure 1 with probability 0.5.

The random number of summands is the number of messages that the 
input bundle was actually carrying; thus its distribution is the LPMF of the 
input bundle. To extend our earlier interpretation, we are saying here that 
the load on a particular output bundle is the number of occupied channels 
in the input bundle that were switched to that output bundle. 

Now the Z-transform of an output channel's LPMF is given by the transform
of the sum of a random number of identically distributed random variables:

$$Z[\mathcal{S}[p_l(l_0), S]] = p_l^T(p_b^T(z)) = p_l^T(1 - S + Sz)$$

We note that, where the probability of switching to both bundles is equal,
the Z-transform for the LPMF resulting from repeated stages of switching
has a particularly simple form. If b is the random variable representing the
number of channels switched to an output bundle, we have

$$p_b^T(z) = Z\left[\tfrac{1}{2}\,\delta(b_0) + \tfrac{1}{2}\,\delta(b_0 - 1)\right] = \frac{z + 1}{2}$$

We can now follow the rule given above, so that n levels of switching cause
repeated substitutions for z, and we have the recurrence relation

$$s(n) = \frac{s(n-1) + 1}{2}, \qquad s(0) = z$$

with solution

$$s(n) = \frac{z + 2^{n} - 1}{2^{n}}$$

Thus, if $l_i$ is the random variable for the load on the input bundle and
$l_n$ is the random variable for the load on an output bundle after n levels of
binary switching with uniform probability of switching to either channel, we
have

$$p_{l_n}^T(z) = p_{l_i}^T\left(\frac{z + 2^{n} - 1}{2^{n}}\right)$$
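The closed form for n levels of equal-probability switching can be confirmed by induction. The short derivation below assumes the recurrence $s(n) = (s(n-1) + 1)/2$ with $s(0) = z$, which is what substituting $(z + 1)/2$ for $z$ at each stage produces.

```latex
% Induction on the number of switching stages n; s(0) = z is the base case.
\begin{align*}
s(0) &= \frac{z + 2^{0} - 1}{2^{0}} = z \\
s(n) &= \frac{s(n-1) + 1}{2}
      = \frac{1}{2}\left(\frac{z + 2^{n-1} - 1}{2^{n-1}} + 1\right) \\
     &= \frac{z + 2^{n-1} - 1 + 2^{n-1}}{2^{n}}
      = \frac{z + 2^{n} - 1}{2^{n}}
\end{align*}
```

For instance, two stages give $s(2) = (z + 3)/4$, which corresponds to each message reaching a given output bundle with probability 1/4.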

Descriptions of Simple Switching Elements 

We describe two simple switching elements, the $2^k \times 2^k$ crossbar and the
Transit RN1 switching element, by using combinations of our three primitive
operations: bundling, concentration, and switching. In depicting the
primitive operations schematically, we use the symbols shown in figure 4.

The $2^k \times 2^k$ Crossbar

The common $2^k \times 2^k$ crossbar network is formed by bundling the $2^k$ inputs,
switching k times (once per bit of routing data), and concentrating the outputs
with a $2^k$-input, one-output concentrator. We depict the probabilistic
model of an eight-by-eight crossbar in figure 5.

For a $2^k \times 2^k$ crossbar, we may find an output channel's LPMF as follows.
If we call the probability that an input channel is transmitting $Q_i$, the LPMF
for an input channel is given by

$$p_y(y_0) = Q_i\,\delta(y_0 - 1) + (1 - Q_i)\,\delta(y_0)$$

with Z-transform

$$p_y^T(z) = Q_i z + (1 - Q_i)$$

Figure 4: (a) The symbol for bundling two input bundles into one. (b)
The symbol for concentrating j channels to k channels. (c) The symbol for
switching with probability q to the top output channel, and 1 - q to the
bottom output channel.

Figure 5: An eight-by-eight crossbar network.



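The LPMF for a bundle of $2^k$ identical input channels has Z-transform

$$p_x^T(z) = \left(p_y^T(z)\right)^{2^k}$$

The result of k stages of switching with equal probability in each of two
directions is

$$p_l^T(z) = \left(Q_i\,\frac{z + 2^{k} - 1}{2^{k}} + (1 - Q_i)\right)^{2^k}$$

We note that this Z-transform is trivially invertible, so that, setting
$M = 2^k$ and rearranging slightly, we have

$$\begin{aligned}
p_l(l_0) &= \sum_{l=0}^{M} \binom{M}{l} \left(\frac{Q_i}{M}\right)^{M-l} \left(1 - \frac{Q_i}{M}\right)^{l} \delta(l_0 - (M - l)) \\
         &= \left(\frac{1}{M}\right)^{M} \sum_{l=0}^{M} \binom{M}{l}\, Q_i^{\,M-l}\, (M - Q_i)^{l}\, \delta(l_0 - (M - l))
\end{aligned}$$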

Now we may perform the M → 1 concentration. Because this is a
concentration to one channel, we may save some work by noting that we can
simply consider the loading probability for zero messages from the LPMF
above; the concentration forces all other loading probabilities to that for one
message, which will necessarily be the complement of the loading probability
for zero messages. We have, for the loading probability for zero messages,

$$p_l(0) = \left(\frac{1}{M}\right)^{M} \sum_{l=0}^{M} \binom{M}{l}\, Q_i^{\,M-l}\, (M - Q_i)^{l}\, \delta(0 - (M - l))$$

Note that the terms in the summation are nonzero only where l = M, so
this expression simplifies to

$$p_l(0) = \left(\frac{1}{M}\right)^{M} \binom{M}{M} (M - Q_i)^{M} = \left(1 - \frac{Q_i}{M}\right)^{M}$$

The LPMF of an output channel is then given by

$$p_l(l_0) = \left(1 - \frac{Q_i}{M}\right)^{M} \delta(l_0) + \left(1 - \left(1 - \frac{Q_i}{M}\right)^{M}\right) \delta(l_0 - 1)$$

As the number of stages k in the crossbar increases, $M = 2^k$ becomes large
quickly, and $p_l(1)$ quickly approaches the limit

$$\lim_{M \to \infty} p_l(1) = \lim_{M \to \infty} \left(1 - \left(1 - \frac{Q_i}{M}\right)^{M}\right) = 1 - e^{-Q_i}$$

In our analysis, the probability of successful message transmission is
given by the ratio of the expected number of messages transmitted by all
the output channels to the expected number of messages on input channels.
In the case of a square crossbar network, this is simply $p_l(1)/Q_i$. We plot
this value as a function of $Q_i$, the input loading on the network, for an
eight-by-eight crossbar network in figure 6.
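The crossbar result can be cross-checked by composing the three primitive operations numerically. The Python sketch below is illustrative only: the helper names mirror the appendix functions, the list representation (index = load) is an assumption, and `crossbar_closed_form` encodes the expression $(1 - (1 - Q_i/M)^M)/Q_i$ with $M = 2^k$.

```python
from math import comb

def bundle(x, y):
    """Discrete convolution of two LPMFs (loads of independent bundles add)."""
    out = [0.0] * (len(x) + len(y) - 1)
    for i, px in enumerate(x):
        for j, py in enumerate(y):
            out[i + j] += px * py
    return out

def switch(x, s):
    """LPMF of an output bundle receiving each message with probability s."""
    top = len(x) - 1
    return [sum(x[i] * comb(i, j) * s**j * (1 - s)**(i - j) for i in range(j, top + 1))
            for j in range(top + 1)]

def concentrate(x, n):
    """Clip the load to at most n messages."""
    return list(x) if n >= len(x) - 1 else list(x[:n]) + [sum(x[n:])]

def crossbar_success(q, k):
    """Success probability of a 2^k x 2^k crossbar, built from the primitives."""
    lpmf = [1.0]
    for _ in range(2**k):                 # bundle the 2^k Bernoulli(q) inputs
        lpmf = bundle(lpmf, [1 - q, q])
    for _ in range(k):                    # k stages of equal-probability switching
        lpmf = switch(lpmf, 0.5)
    lpmf = concentrate(lpmf, 1)           # one output channel
    return lpmf[1] / q

def crossbar_closed_form(q, k):
    m = 2**k
    return (1 - (1 - q / m)**m) / q
```

Calling `crossbar_success(q, 3)` for a range of input loadings `q` traces out the quantity plotted for the eight-by-eight crossbar, and it agrees with `crossbar_closed_form(q, 3)` up to rounding.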

Figure 6: The probability of successful message transmission as a function of
$Q_i$, the input loading on the network, for an eight-by-eight crossbar network.

The Transit RN1 Switching Element

The RN1 switching element is a prototype for the switching element to be
used in the Transit interconnection network for massively parallel computers,
being built by the Transit Group at MIT's Artificial Intelligence Laboratory.
The RN1 switching element can be configured in one of two ways; the first is
as two four-by-four crossbars, and the second is as an eight-by-four crossbar
with a dilation of two, meaning that only four logical output directions are
available, but two messages can be carried in each. It is the second of these
configurations whose performance we analyze. We depict the element in
figure 7.

Figure 7: The RN1 switching element, in the eight-by-four, dilation two
configuration.

The derivation of the LPMF of a two-channel output bundle for the RN1
switching element follows. As above, if we call the probability that an input
channel is transmitting $Q_i$, the LPMF for an input channel is given by

$$p_y(y_0) = Q_i\,\delta(y_0 - 1) + (1 - Q_i)\,\delta(y_0)$$

with Z-transform

$$p_y^T(z) = Q_i z + (1 - Q_i)$$

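The LPMF for a bundle of 8 identical input channels has Z-transform

$$p_x^T(z) = \left(p_y^T(z)\right)^{8}$$

The result of two stages of switching with equal probability in each of two
directions is

$$\begin{aligned}
p_l^T(z) &= \left(p_y^T\!\left(\frac{z + 3}{4}\right)\right)^{8} \\
         &= \left(\frac{Q_i z}{4} + \left(1 - \frac{Q_i}{4}\right)\right)^{8}
\end{aligned}$$

Again as in the case of the crossbar, we invert the transform

$$p_l(l_0) = \left(\frac{1}{4}\right)^{8} \sum_{l=0}^{8} \binom{8}{l}\, Q_i^{\,8-l}\, (4 - Q_i)^{l}\, \delta(l_0 - (8 - l))$$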



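and then perform the concentration. In this case the concentration is to
two channels, so that we must consider probabilities for the two cases that
the output bundle carries zero messages or one message in order to use the
method we did before for deriving the concentrated LPMF.

For zero messages, the sum is zero except where l = 8, so that we have:

$$p_l(0) = \left(\frac{1}{4}\right)^{8} (4 - Q_i)^{8} = \left(1 - \frac{Q_i}{4}\right)^{8}$$

For one message, the sum is zero except where l = 7, so we have:

$$p_l(1) = \left(\frac{1}{4}\right)^{8} \cdot 8\, Q_i\, (4 - Q_i)^{7} = 2 Q_i \left(1 - \frac{Q_i}{4}\right)^{7}$$

After concentration, the probability for two messages must be the sum
of the probabilities for higher loads, so that the LPMF for a two-channel
output bundle is given by

$$\begin{aligned}
p_l(l_0) ={} & \left(1 - \frac{Q_i}{4}\right)^{8} \delta(l_0) + 2 Q_i \left(1 - \frac{Q_i}{4}\right)^{7} \delta(l_0 - 1) \\
            & + \left(1 - \left(1 - \frac{Q_i}{4}\right)^{7}\left(1 + \frac{7 Q_i}{4}\right)\right) \delta(l_0 - 2)
\end{aligned}$$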

We form the probability of successful message transmission as the ratio
of the expectation of the number of messages on all output channels to
the expectation of the number of messages on all input channels. In this
analysis we have assumed uniformity and independence of input loading and
a uniform distribution of message destinations, so that the expectation of
the input loading is simply $\sum_{n=1}^{8} Q_i = 8 Q_i$ and that of the output loading
(if we recall that the random variable giving the number of messages on an
output bundle is $l$) is

$$\sum_{n=1}^{4} E[l] = 4\left(1 \cdot 2 Q_i \left(1 - \frac{Q_i}{4}\right)^{7} + 2 \cdot \left(1 - \left(1 - \frac{Q_i}{4}\right)^{7}\left(1 + \frac{7 Q_i}{4}\right)\right)\right)$$



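Thus the probability of successful message transmission is given by

$$\begin{aligned}
\frac{\sum_{n=1}^{4} E[l]}{8 Q_i}
  &= \frac{Q_i \left(1 - \frac{Q_i}{4}\right)^{7} + 1 - \left(1 - \frac{Q_i}{4}\right)^{7}\left(1 + \frac{7 Q_i}{4}\right)}{Q_i} \\
  &= \frac{1 + \left(Q_i - \left(1 + \frac{7 Q_i}{4}\right)\right)\left(1 - \frac{Q_i}{4}\right)^{7}}{Q_i} \\
  &= \frac{1 - \left(1 + \frac{3 Q_i}{4}\right)\left(1 - \frac{Q_i}{4}\right)^{7}}{Q_i}
\end{aligned}$$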

We plot the probability of successful message transmission versus the 
input loading in figure 8. 
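The RN1 figure can be reproduced numerically by chaining the primitives: bundle eight inputs, switch twice with probability 1/2, and concentrate to two channels. This Python sketch is an illustration under assumptions: the helper names mirror the appendix, LPMFs are lists indexed by load, and `rn1_closed_form` encodes $(1 - (1 + 3Q_i/4)(1 - Q_i/4)^7)/Q_i$.

```python
from math import comb

def bundle(x, y):
    """Discrete convolution of two LPMFs."""
    out = [0.0] * (len(x) + len(y) - 1)
    for i, px in enumerate(x):
        for j, py in enumerate(y):
            out[i + j] += px * py
    return out

def switch(x, s):
    """LPMF of an output bundle receiving each message with probability s."""
    top = len(x) - 1
    return [sum(x[i] * comb(i, j) * s**j * (1 - s)**(i - j) for i in range(j, top + 1))
            for j in range(top + 1)]

def concentrate(x, n):
    """Clip the load to at most n messages."""
    return list(x) if n >= len(x) - 1 else list(x[:n]) + [sum(x[n:])]

def rn1_success(q):
    """Success probability for the eight-by-four, dilation-two configuration."""
    lpmf = [1.0]
    for _ in range(8):                    # eight Bernoulli(q) input channels
        lpmf = bundle(lpmf, [1 - q, q])
    for _ in range(2):                    # two stages select one of four directions
        lpmf = switch(lpmf, 0.5)
    lpmf = concentrate(lpmf, 2)           # two channels per output direction
    expected_out = sum(load * p for load, p in enumerate(lpmf))
    return 4 * expected_out / (8 * q)     # four output bundles, eight inputs

def rn1_closed_form(q):
    return (1 - (1 + 3 * q / 4) * (1 - q / 4)**7) / q
```

Evaluating `rn1_success(q)` over input loadings between 0 and 1 gives the curve described for figure 8, and it matches `rn1_closed_form(q)` up to rounding.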

Analyzing the Performance of More Complex Networks 

It may be difficult to simplify the expressions describing more complex networks
built from arrays of simple switching elements like those we have
analyzed above. Indeed, Patel [5] and Kruskal and Snir [3] derive expressions
only for simple, regular networks; these are $a^n \times b^n$ delta networks in
the case of Patel's work, and square Banyan networks of arbitrary dilation
in the case of Kruskal and Snir's work.

Figure 8: Probability of successful message transmission plotted vs. input
loading for the Transit RN1 switching element in its eight-by-four, dilation
two configuration.

The advantage of our approach is that such analyses are automated. One 
specifies the topology of the network, forms the sequence of operators that 
describes the loading probability mass function for an output bundle, and 
evaluates it. This evaluation can be numeric or in the form of a parameterized
expression. The derivation of such an expression for a complex network
is aided by the use of a symbolic mathematics program like Macsyma or
Mathematica. We present a set of Mathematica functions that may be used 
for such analysis in the appendix. 
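As an illustration of such an automated evaluation with unequal input loadings, the Python sketch below composes the three operators for a four-by-four crossbar. It is a hedged example rather than the appendix code: the helper names mirror the appendix functions, the input probabilities in `loads` are made-up values, and `product_form` is a cross-check derived from the transform of the bundled inputs.

```python
from math import comb, prod

def bundle(x, y):
    """Discrete convolution of two LPMFs."""
    out = [0.0] * (len(x) + len(y) - 1)
    for i, px in enumerate(x):
        for j, py in enumerate(y):
            out[i + j] += px * py
    return out

def switch(x, s):
    """LPMF of an output bundle receiving each message with probability s."""
    top = len(x) - 1
    return [sum(x[i] * comb(i, j) * s**j * (1 - s)**(i - j) for i in range(j, top + 1))
            for j in range(top + 1)]

def concentrate(x, n):
    """Clip the load to at most n messages."""
    return list(x) if n >= len(x) - 1 else list(x[:n]) + [sum(x[n:])]

def crossbar_success(loads):
    """Success probability of a crossbar whose inputs have differing loadings.

    len(loads) is assumed to be a power of two.
    """
    m = len(loads)
    k = m.bit_length() - 1                # number of switching stages
    lpmf = [1.0]
    for q in loads:                       # bundle inputs with their own probabilities
        lpmf = bundle(lpmf, [1 - q, q])
    for _ in range(k):
        lpmf = switch(lpmf, 0.5)
    lpmf = concentrate(lpmf, 1)
    return m * lpmf[1] / sum(loads)       # expected output load over expected input load

loads = [0.9, 0.1, 0.5, 0.5]              # hypothetical, non-uniform input loadings
success = crossbar_success(loads)

# Cross-check: an output is idle only if no input sends to it.
product_form = len(loads) * (1 - prod(1 - q / len(loads) for q in loads)) / sum(loads)
```

Only the list of input probabilities changes between the uniform and non-uniform cases; the sequence of operators stays the same.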

Future Work 

The methodology described above, despite its advantages, does not yield 
a completely satisfactory model of a Banyan network's performance. We 
describe now some of the disadvantages of the methodology. 

The probability of successful message transmission alone will not be a 
faithful measure of performance in a network that is buffered. In the Transit 
network, for example, although the individual switching elements themselves 
do not contain buffers, messages are effectively buffered at the inputs to the
network. Thus it will be desirable to extend the methodology with queueing- 
theoretic techniques to allow the creation of more faithful models. In future 
work, we hope to do this in a fashion that continues to allow analyses of 
Banyan networks of arbitrary topology. 

While the methodology does allow analysis of networks where one or 
more sources are likely to be more active than others, it does not easily lend 
itself to an analysis of networks where one or more sinks are more likely to 
be the destination of messages than others. In fact, the general case of this 
problem, where messages entering a Banyan network of arbitrary topology 
have an arbitrary distribution of destination addresses, remains unsolved to 
date. 

The analysis technique described is appropriate only to Banyan networks.
While these constitute a large class, some of the fault-tolerance
features of Banyan networks used in practice may include redundant paths
between sources and sinks. It will be necessary to extend the technique to
encompass replications and dilations of Banyan networks; in more complicated
cases, it may be necessary to supplement it with a different approach,
or abandon it altogether.

Another disadvantage of the methodology we have described lies in its 
tacit assumption that the network modeled is completely synchronous. This 
assumption is not always justified; for example, in the case of a circuit- 
switched network like the Transit network, successful message transmission 
creates a circuit which is held open until a reply is sent. The circuit is held for 
a number of cycles, during which other messages may be transmitted from 
the inputs and be blocked because paths at succeeding stages are already in 
use. 

A related disadvantage of our methodology lies in its assumption that 
the path being built by a message being transmitted in a circuit-switched 
network immediately disappears, freeing all associated resources, if the mes- 
sage is blocked, whereas in reality it will take a number of cycles for these 
resources to be freed. Nussbaum and his colleagues have found this to be a
significant factor in discrepancies between Patel's model and their simulation,
as described in [4].

We hope to address some of these disadvantages in ongoing work. The 
ideal result would be a technique for automatically generating an accurate 
model of the performance of a multistage interconnection network given only 
a description of the network topology. 
Appendix: Mathematica Functions for Banyan Network Analysis

concentrate::usage =
    "concentrate[x, n] concentrates the LPMF x to n channels."

concentrate[x_, n_] :=
    (* get distribution for 0 through n-1 channels, and add
       as last element the sum of the rest of the channels. *)
    Append[Take[x, n], Apply[Plus, Drop[x, n]]]

discreteconvolution::usage =
    "discreteconvolution[x, y] treats x and y as 0-based
    vectors and returns their discrete convolution."

discreteconvolution[x_, y_] :=
    Block[{xlgth, ylgth, lgth},
        xlgth = Length[x];
        ylgth = Length[y];
        lgth = xlgth + ylgth - 1;
        (* in summation, portions of sequence with indices
           out of range for sequences must be treated as
           0. *)
        Table[Sum[If[k < 1 || k > xlgth ||
                     (n-k+1) < 1 || (n-k+1) > ylgth,
                     0,
                     (* because of the 0->1 index
                        translation, we increase the y-index
                        to shift the result sequence back
                        down to begin at 1. *)
                     x[[k]] y[[n-k+1]]],
                  {k, xlgth}],
              {n, lgth}]]

bundle::usage =
    "bundle[x, y] forms the LPMF that results from bundling
    two input bundles with LPMFs x and y."

bundle[x_, y_] :=
    discreteconvolution[x, y]

switch::usage =
    "switch[x, p] returns the LPMF of an output bundle to
    which x is switched with probability p."

switch[x_, p_] :=
    Block[{lgth},
        lgth = Length[x];
        (* x[[i+1]] is the probability that the input bundle carries
           i messages; n of them reach this output bundle. *)
        Table[Sum[x[[i+1]] Binomial[i, n] p^n (1-p)^(i-n),
                  {i, n, lgth-1}],
              {n, 0, lgth-1}]]